Controlling View Divergence of Data Freshness in a Replicated Database System Using Statistical Update Delay Estimation

Authors

  • Takao Yamashita
  • Satoshi Ono
Abstract

We propose a method of controlling the view divergence of data freshness when the copies held at the sites of a replicated database are updated asynchronously. The view divergence of replicated data freshness is the difference in the recentness of the updates reflected in the data acquired by clients. Our method accesses multiple sites and provides a client with data that reflects all the updates received by those sites. First, we define the probabilistic recentness of the updates reflected in acquired data as read data freshness (RDF). The degree of RDF of the data acquired by clients is the range of view divergence. Second, we propose a way to select sites in a replicated database by using the probability distribution of the update delays, so that the data acquired by a client satisfies its required RDF. This selection calculates the minimum number of sites to access in order to reduce the overhead of read transactions. Our method continues to adaptively and reliably provide data that meets the client's requirements in an environment where the delay of update propagation varies and applications' requirements change depending on the situation. Finally, we evaluate by simulation how far our method can control the view divergence. The simulation showed that, for 100 replicas, our method can reduce the view divergence to about 1/4 that of a normal read transaction. In addition, the overhead that our method adds to a read transaction increases more slowly than the total number of replicas.

Key words: information dissemination, update delay distribution, nonparametric estimation, asynchronous update, lazy replication
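The site-selection idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: it assumes that update delays are independent and identically distributed across sites, estimates their distribution with a plain empirical CDF, and considers a single update. The names `empirical_cdf`, `min_sites`, `age`, `required_prob` and `max_sites` are hypothetical and chosen only for this illustration.

```python
from bisect import bisect_right


def empirical_cdf(delays, t):
    """Nonparametric estimate of P(delay <= t) from observed update delays."""
    ordered = sorted(delays)
    return bisect_right(ordered, t) / len(ordered)


def min_sites(delays, age, required_prob, max_sites=100):
    """Smallest number of sites k such that an update issued `age` time units
    ago has reached at least one of the k sites with probability of at least
    `required_prob`.

    Assumption of this sketch: delays at different sites are independent and
    follow the same distribution, so the update is missing from all k sites
    with probability (1 - p) ** k, where p = P(delay <= age).
    Returns None if no k up to `max_sites` meets the requirement.
    """
    p = empirical_cdf(delays, age)
    for k in range(1, max_sites + 1):
        if 1.0 - (1.0 - p) ** k >= required_prob:
            return k
    return None


# Hypothetical observed delays, in arbitrary time units.
observed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(min_sites(observed, age=3, required_prob=0.9))
```

Under these assumptions, a stricter freshness requirement (a smaller `age` or a larger `required_prob`) increases the number of sites a read has to contact, which is the trade-off between freshness and read overhead that the abstract describes.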

Similar resources

Data Quality Management in a Database Cluster with Lazy Replication

We consider the use of a database cluster with lazy replication. In this context, controlling the quality of replicated data based on users’ requirements is important to improve performance. However, existing approaches are limited to a particular aspect of data quality. In this paper, we propose a general model of data quality which makes the difference between “freshness” and “validi...

Transaction Routing with Freshness Control in a Cluster of Replicated Databases

We consider the use of a cluster system with a shared nothing architecture for update-intensive autonomous databases. To optimize load balancing, we use optimistic database replication with freshness control. We propose a solution to transaction routing that preserves database and application autonomy and a cost model to estimate replica freshness. Then we propose an algorithm for transaction r...

Minimizing Content Staleness in Dynamo-Style Replicated Storage Systems

Consistency in data storage systems requires any read operation to return the most recent written version of the content. In replicated storage systems, consistency comes at the price of delay due to large-scale write and read operations. Many applications with low latency requirements tolerate data staleness in order to provide high availability and low operation latency. Using age of informat...

Fraîcheur et validité de données répliquées dans des environnements transactionnels

We propose a framework for managing the quality of data replicated optimistically on a database cluster. It is based on a model of quality of data. Qualitatively, we make the difference between “freshness” and “validity” of data. Quantitatively, data quality is expressed through divergence measures between the data read and the same data with perfect quality. Users specify a minimum level of qu...

Update propagation strategies to improve data freshness in lazy master scheme

Many distributed database applications need to replicate data to improve data availability and query response time. The two-phase-commit protocol guarantees mutual consistency of replicated data but does not provide good performance. Lazy replication has been used as an alternative solution in several types of applications such as on-line financial transactions and telecommunication systems. In this case...

Journal:
  • IEICE Transactions

Volume 88-D

Publication date: 2005